change data type for timestamp in Developer lab 2.1. #21

Open
wants to merge 1 commit into
base: main

Conversation

dianacarroll
Collaborator

The solutions used DateTime, which causes an error on insert:

Received exception:
Code: 53. DB::Exception: Type mismatch in IN or VALUES section. Expected: DateTime. Got: Decimal64:  (When parsing Parquet statistics for column TIMESTAMP, physical type 2, column descriptor = {
  name: TIMESTAMP,
  path: TIMESTAMP,
  physical_type: INT64,
  converted_type: TIMESTAMP_MILLIS,
  logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=milliseconds, is_from_converted_type=true, force_set_converted_type=false),
  max_definition_level: 1,
  max_repetition_level: 0,
}. Please report an issue and use input_format_parquet_filter_push_down = false to work around.): (in file/uri datasets-documentation/pypi/2023/pypi_0_7_34.snappy.parquet): While executing ParquetBlockInputFormat: While executing S3Source. (TYPE_MISMATCH)

The DESCRIBE command shows the type in the Parquet file as Nullable(DateTime64). Using DateTime64 in the new table eliminates the error.

Also removed a wayward .DS_Store file from the repo.

Existing data type causes an error "Type mismatch in IN or VALUES section. Expected: DateTime. Got: Decimal64:"
@dianacarroll dianacarroll self-assigned this Sep 17, 2024
@CLAassistant

CLAassistant commented Sep 17, 2024

CLA assistant check
All committers have signed the CLA.

@dianacarroll dianacarroll requested review from pmusa and removed request for rfraposa October 1, 2024 15:07
@pmusa
Collaborator

pmusa commented Oct 3, 2024

I didn't get this error with the DateTime solution. The dataset timestamp is second-based (no ms value), so I think it is fine the way it is. Did you get this error in the SQL Console?
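
To illustrate the point about second-based values: a Parquet TIMESTAMP_MILLIS column stores each value as an INT64 count of milliseconds since the Unix epoch (UTC). The minimal Python sketch below checks whether any value has a non-zero millisecond part, which is what would be lost in a second-precision column. The function names and sample values are hypothetical and are not taken from the pypi dataset or the lab.

```python
from datetime import datetime, timezone


def millis_to_utc(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch (UTC) to an aware datetime."""
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


def has_subsecond_part(values: list[int]) -> bool:
    """Return True if any value carries a non-zero millisecond component."""
    return any(ms % 1000 != 0 for ms in values)


# Hypothetical sample values (assumption, not read from the Parquet file).
whole_seconds = [1_700_000_000_000, 1_700_000_001_000]
with_millis = [1_700_000_000_123]

print(has_subsecond_part(whole_seconds))  # no millisecond part to lose
print(has_subsecond_part(with_millis))    # millisecond part present
print(millis_to_utc(with_millis[0]).isoformat())
```

If every value is a whole second, storing it at second precision loses no information. The sketch says nothing about whether the insert itself succeeds, which is the question in this thread.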

@dianacarroll
Collaborator Author

Yes, I believe I got it in the Cloud SQL console (it was a while ago).

@pmusa
Collaborator

pmusa commented Oct 7, 2024

I couldn't reproduce it. Shall I close it, or do you want to try it again?

@dianacarroll
Collaborator Author

dianacarroll commented Oct 7, 2024

I don't know what you mean by "try again". Are you thinking the issue might go away if I run it a few more times? It doesn't. What DOES fix it is the code change in the PR.

image image

@dianacarroll
Collaborator Author

dianacarroll commented Oct 7, 2024

The instructions tell students to use the datatypes used in the inferred schema, minus the Nullable.

image

The inferred schema returns Nullable(DateTime64(3)).

image

So an even better solution would be to use DateTime64(3). I was going for simplicity but if you prefer, let's follow the letter of the instructions. That fixes the issue too.

image
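
The "inferred type, minus the Nullable" rule from the instructions can be sketched as a small Python helper. This is only an illustration under stated assumptions: `strip_nullable` is a hypothetical name, not part of the lab or of ClickHouse, and it handles only a single outer `Nullable(...)` wrapper.

```python
def strip_nullable(inferred_type: str) -> str:
    """Return the type name without an outer Nullable(...) wrapper, if present."""
    inferred_type = inferred_type.strip()
    prefix, suffix = "Nullable(", ")"
    if inferred_type.startswith(prefix) and inferred_type.endswith(suffix):
        # Drop the wrapper but keep inner parameters such as the (3) precision.
        return inferred_type[len(prefix):-len(suffix)]
    return inferred_type


print(strip_nullable("Nullable(DateTime64(3))"))  # DateTime64(3)
print(strip_nullable("DateTime"))                 # DateTime
```

Applied to the inferred type quoted above, the rule gives DateTime64(3) rather than DateTime.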

@pmusa
Collaborator

pmusa commented Oct 8, 2024

I just couldn't reproduce it and still can't. Hence the question. Let's look into it together and try to fix it.

Screenshot 2024-10-08 at 13 13 28

@pmusa
Collaborator

pmusa commented Oct 15, 2024

We looked into it, and the error should not happen (the error above is not a training bug). I would rather have DateTime than DateTime64.

3 participants